<!DOCTYPE html>

<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

    <meta http-equiv="x-ua-compatible" content="ie=edge">
    
    <title>2.2.2. U2I召回 &#8212; FunRec 推荐系统 0.0.1 documentation</title>

    <link rel="stylesheet" href="../../_static/material-design-lite-1.3.0/material.blue-deep_orange.min.css" type="text/css" />
    <link rel="stylesheet" href="../../_static/sphinx_materialdesign_theme.css" type="text/css" />
    <link rel="stylesheet" href="../../_static/fontawesome/all.css" type="text/css" />
    <link rel="stylesheet" href="../../_static/fonts.css" type="text/css" />
    <link rel="stylesheet" type="text/css" href="../../_static/pygments.css" />
    <link rel="stylesheet" type="text/css" href="../../_static/basic.css" />
    <link rel="stylesheet" type="text/css" href="../../_static/d2l.css" />
    <script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script>
    <script src="../../_static/jquery.js"></script>
    <script src="../../_static/underscore.js"></script>
    <script src="../../_static/_sphinx_javascript_frameworks_compat.js"></script>
    <script src="../../_static/doctools.js"></script>
    <script src="../../_static/sphinx_highlight.js"></script>
    <script src="../../_static/d2l.js"></script>
    <script async="async" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
    <link rel="index" title="Index" href="../../genindex.html" />
    <link rel="search" title="Search" href="../../search.html" />
    <link rel="next" title="2.2.3. 总结" href="3.summary.html" />
    <link rel="prev" title="2.2.1. I2I召回" href="1.i2i.html" /> 
  </head>
<body>
    <div class="mdl-layout mdl-js-layout mdl-layout--fixed-header mdl-layout--fixed-drawer"><header class="mdl-layout__header mdl-layout__header--waterfall ">
    <div class="mdl-layout__header-row">
        
        <nav class="mdl-navigation breadcrumb">
            <a class="mdl-navigation__link" href="../index.html"><span class="section-number">2. </span>召回模型</a><i class="material-icons">navigate_next</i>
            <a class="mdl-navigation__link" href="index.html"><span class="section-number">2.2. </span>向量召回</a><i class="material-icons">navigate_next</i>
            <a class="mdl-navigation__link is-active"><span class="section-number">2.2.2. </span>U2I召回</a>
        </nav>
        <div class="mdl-layout-spacer"></div>
        <nav class="mdl-navigation">
        
<form class="form-inline pull-sm-right" action="../../search.html" method="get">
      <div class="mdl-textfield mdl-js-textfield mdl-textfield--expandable mdl-textfield--floating-label mdl-textfield--align-right">
        <label id="quick-search-icon" class="mdl-button mdl-js-button mdl-button--icon"  for="waterfall-exp">
          <i class="material-icons">search</i>
        </label>
        <div class="mdl-textfield__expandable-holder">
          <input class="mdl-textfield__input" type="text" name="q"  id="waterfall-exp" placeholder="Search" />
          <input type="hidden" name="check_keywords" value="yes" />
          <input type="hidden" name="area" value="default" />
        </div>
      </div>
      <div class="mdl-tooltip" data-mdl-for="quick-search-icon">
      Quick search
      </div>
</form>
        
<a id="button-show-source"
    class="mdl-button mdl-js-button mdl-button--icon"
    href="../../_sources/chapter_1_retrieval/2.embedding/2.u2i.rst.txt" rel="nofollow">
  <i class="material-icons">code</i>
</a>
<div class="mdl-tooltip" data-mdl-for="button-show-source">
Show Source
</div>
        </nav>
    </div>
    <div class="mdl-layout__header-row header-links">
      <div class="mdl-layout-spacer"></div>
      <nav class="mdl-navigation">
          
              <a  class="mdl-navigation__link" href="https://funrec-notebooks.s3.eu-west-3.amazonaws.com/fun-rec.zip">
                  <i class="fas fa-download"></i>
                  Jupyter 记事本
              </a>
          
              <a  class="mdl-navigation__link" href="https://github.com/datawhalechina/fun-rec">
                  <i class="fab fa-github"></i>
                  GitHub
              </a>
      </nav>
    </div>
</header><header class="mdl-layout__drawer">
    
          <!-- Title -->
      <span class="mdl-layout-title">
          <a class="title" href="../../index.html">
              <span class="title-text">
                  FunRec 推荐系统
              </span>
          </a>
      </span>
    
    
      <div class="globaltoc">
        <span class="mdl-layout-title toc">Table Of Contents</span>
        
        
            
            <nav class="mdl-navigation">
                <ul>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_preface/index.html">前言</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_installation/index.html">安装</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_notation/index.html">符号</a></li>
</ul>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../../chapter_0_introduction/index.html">1. 推荐系统概述</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_0_introduction/1.intro.html">1.1. 推荐系统是什么？</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_0_introduction/2.outline.html">1.2. 本书概览</a></li>
</ul>
</li>
<li class="toctree-l1 current"><a class="reference internal" href="../index.html">2. 召回模型</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="../1.cf/index.html">2.1. 协同过滤</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../1.cf/1.itemcf.html">2.1.1. 基于物品的协同过滤</a></li>
<li class="toctree-l3"><a class="reference internal" href="../1.cf/2.usercf.html">2.1.2. 基于用户的协同过滤</a></li>
<li class="toctree-l3"><a class="reference internal" href="../1.cf/3.mf.html">2.1.3. 矩阵分解</a></li>
<li class="toctree-l3"><a class="reference internal" href="../1.cf/4.summary.html">2.1.4. 总结</a></li>
</ul>
</li>
<li class="toctree-l2 current"><a class="reference internal" href="index.html">2.2. 向量召回</a><ul class="current">
<li class="toctree-l3"><a class="reference internal" href="1.i2i.html">2.2.1. I2I召回</a></li>
<li class="toctree-l3 current"><a class="current reference internal" href="#">2.2.2. U2I召回</a></li>
<li class="toctree-l3"><a class="reference internal" href="3.summary.html">2.2.3. 总结</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../3.sequence/index.html">2.3. 序列召回</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../3.sequence/1.user_interests.html">2.3.1. 深化用户兴趣表示</a></li>
<li class="toctree-l3"><a class="reference internal" href="../3.sequence/2.generateive_recall.html">2.3.2. 生成式召回方法</a></li>
<li class="toctree-l3"><a class="reference internal" href="../3.sequence/3.summary.html">2.3.3. 总结</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_2_ranking/index.html">3. 精排模型</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_2_ranking/1.wide_and_deep.html">3.1. 记忆与泛化</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_2_ranking/2.feature_crossing/index.html">3.2. 特征交叉</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_2_ranking/2.feature_crossing/1.second_order.html">3.2.1. 二阶特征交叉</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_2_ranking/2.feature_crossing/2.higher_order.html">3.2.2. 高阶特征交叉</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_2_ranking/3.sequence.html">3.3. 序列建模</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_2_ranking/4.multi_objective/index.html">3.4. 多目标建模</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_2_ranking/4.multi_objective/1.arch.html">3.4.1. 基础结构演进</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_2_ranking/4.multi_objective/2.dependency_modeling.html">3.4.2. 任务依赖建模</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_2_ranking/4.multi_objective/3.multi_loss_optim.html">3.4.3. 多目标损失融合</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_2_ranking/5.multi_scenario/index.html">3.5. 多场景建模</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_2_ranking/5.multi_scenario/1.multi_tower.html">3.5.1. 多塔结构</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_2_ranking/5.multi_scenario/2.dynamic_weight.html">3.5.2. 动态权重建模</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_3_rerank/index.html">4. 重排模型</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_3_rerank/1.greedy.html">4.1. 基于贪心的重排</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_3_rerank/2.personalized.html">4.2. 基于个性化的重排</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_3_rerank/3.summary.html">4.3. 本章小结</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_4_trends/index.html">5. 难点及热点研究</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/1.debias.html">5.1. 模型去偏</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/2.cold_start.html">5.2. 冷启动问题</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/3.generative.html">5.3. 生成式推荐</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/4.summary.html">5.4. 本章小结</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_5_projects/index.html">6. 项目实践</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/1.understanding.html">6.1. 赛题理解</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/2.baseline.html">6.2. Baseline</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/3.analysis.html">6.3. 数据分析</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/4.recall.html">6.4. 多路召回</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/5.feature_engineering.html">6.5. 特征工程</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/6.ranking.html">6.6. 排序模型</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_appendix/index.html">7. Appendix</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_appendix/word2vec.html">7.1. Word2vec</a></li>
</ul>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_references/references.html">参考文献</a></li>
</ul>

            </nav>
        
        </div>
    
</header>
        <main class="mdl-layout__content" tabIndex="0">

	<script type="text/javascript" src="../../_static/sphinx_materialdesign_theme.js "></script>

    <div class="document">
        <div class="page-content" role="main">
        
  <section id="u2i">
<span id="id1"></span><h1><span class="section-number">2.2.2. </span>U2I Retrieval<a class="headerlink" href="#u2i" title="Permalink to this heading">¶</a></h1>
<p>Having explored I2I retrieval, we now turn to another equally important technical path: U2I (user-to-item) retrieval. If I2I retrieval answers the question "what else will people who bought this item buy?", then U2I retrieval tackles the core question of recommender systems head-on: "what items will this user like?"</p>
<p>The core challenge of U2I retrieval is how to quickly find, within an enormous item corpus, a candidate set that closely matches a user's interests. Traditional collaborative filtering is effective, but with hundreds of millions of users and tens of millions of items its computational cost becomes an insurmountable obstacle. The evolution of U2I retrieval is, in essence, the process of gradually reducing a complex "matching" problem to an efficient "search" problem.</p>
<p>The key breakthrough behind this shift is a unifying architectural idea: the <strong>two-tower model</strong>. The classic factorization machine (FM), the Deep Structured Semantic Model (DSSM), and YouTube's deep neural network (YouTubeDNN) look very different on the surface, yet all follow the same design philosophy: encode users and items as separate vectors, then measure how well they match by computing the similarity between those vectors.</p>
<p>The core idea of the two-tower model is to decompose the recommendation problem into two relatively independent subproblems. The <strong>user tower</strong> focuses on understanding the user: it processes the user's historical behavior, demographic features, contextual information, and so on, and outputs a vector <span class="math notranslate nohighlight">\(u\)</span> representing the user's interests. The <strong>item tower</strong> specializes in characterizing items: it integrates the item's ID, category, attributes, content features, and so on, and outputs a vector <span class="math notranslate nohighlight">\(v\)</span> representing the item's properties.</p>
<p>This "divide and conquer" design brings major engineering advantages. Once training is complete, the vectors of all items can be <strong>precomputed offline</strong> and stored in an efficient vector retrieval system (such as Faiss or Annoy). When a user issues a recommendation request, the system only needs to compute the user vector in real time and then run an <strong>approximate nearest neighbor (ANN) search</strong> to quickly find the most similar item vectors. The elegance of this architecture is that it reduces the <span class="math notranslate nohighlight">\(O(U \times I)\)</span> cost of exhaustive user-item matching to <span class="math notranslate nohighlight">\(O(U + I)\)</span> vector computations.</p>
<p>The user-item match is measured by the <strong>dot product</strong> or <strong>cosine similarity</strong> of the two vectors:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-0">
<span class="eqno">(2.2.10)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-0" title="Permalink to this equation">¶</a></span>\[score(u, v) = u \cdot v = \sum_{i=1}^{d} u_i v_i\]</div>
<p>where <span class="math notranslate nohighlight">\(d\)</span> is the vector dimension, and <span class="math notranslate nohighlight">\(u_i\)</span> and <span class="math notranslate nohighlight">\(v_i\)</span> are the <span class="math notranslate nohighlight">\(i\)</span>-th components of <span class="math notranslate nohighlight">\(u\)</span> and <span class="math notranslate nohighlight">\(v\)</span>. Behind this simple operation lies the deeper notion of "semantic similarity": distance in the vector space reflects how well a user's interests match an item's properties.</p>
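<p>The scoring-and-retrieval flow above can be sketched in a few lines of NumPy. This is a minimal brute-force illustration only: in production the top-k step would be handled by an ANN library such as Faiss or Annoy, and the vectors would come from trained towers rather than random data (all names and sizes here are made up for the sketch):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                      # embedding dimension
item_vectors = rng.normal(size=(1000, d))  # precomputed offline by the item tower

def retrieve_top_k(user_vector, item_vectors, k=10):
    # score(u, v) = u . v against every item, then keep the k highest scores
    scores = item_vectors @ user_vector
    top_k = np.argsort(-scores)[:k]
    return top_k, scores[top_k]

user_vector = rng.normal(size=d)           # computed online by the user tower
ids, scores = retrieve_top_k(user_vector, item_vectors)
```

<p>Because <code class="docutils literal notranslate"><span class="pre">item_vectors</span></code> is fixed between model refreshes, only the single matrix-vector product depends on the incoming request.</p>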
<p>Next, we follow the evolution of the two-tower model, from its classical mathematical foundations to modern deep learning implementations, and examine these milestone works one by one.</p>
<section id="fm">
<span id="fm-matching-model"></span><h2><span class="section-number">2.2.2.1. </span>FM (Factorization Machines): A Prototype of the Two-Tower Model<a class="headerlink" href="#fm" title="Permalink to this heading">¶</a></h2>
<p>Although the factorization machine (FM)
<span id="id2">(<a class="reference internal" href="../../chapter_references/references.html#id12" title="Rendle, S. (2010). Factorization machines. 2010 IEEE International conference on data mining (pp. 995–1000).">Rendle, 2010</a>)</span>
predates the rise of deep learning, it can fairly be called a conceptual prototype of the two-tower model. FM's core contribution is that it was the first to elegantly decompose complex user-item interactions into the inner product of two low-dimensional vectors.</p>
<section id="id3">
<h3><span class="section-number">2.2.2.1.1. </span>From Interaction Matrices to Vector Inner Products<a class="headerlink" href="#id3" title="Permalink to this heading">¶</a></h3>
<p>The full mathematical expression of the FM model is:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-1">
<span class="eqno">(2.2.11)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-1" title="Permalink to this equation">¶</a></span>\[\hat{y}(\mathbf{x}):=w_{0}+\sum_{i=1}^{n} w_{i} x_{i}+\sum_{i=1}^{n} \sum_{j=i+1}^{n}\left\langle\mathbf{v}_{i}, \mathbf{v}_{j}\right\rangle x_{i} x_{j}\]</div>
<p>The formula looks complicated, but its core idea is simple and profound: each feature <span class="math notranslate nohighlight">\(i\)</span> is associated with a <span class="math notranslate nohighlight">\(k\)</span>-dimensional latent vector <span class="math notranslate nohighlight">\(\mathbf{v}_i\)</span>, and the interaction between two features is modeled by the inner product <span class="math notranslate nohighlight">\(\langle\mathbf{v}_{i}, \mathbf{v}_{j}\rangle = \sum_{f=1}^{k} v_{i,f} \cdot v_{j,f}\)</span> of their latent vectors.</p>
<p>FM's real ingenuity lies in an algebraic transformation. The second-order interaction term, naively requiring <span class="math notranslate nohighlight">\(O(n^2)\)</span> pairwise computations, can be rewritten as:</p>
<div class="math notranslate nohighlight" id="equation-eq-fm-cross">
<span class="eqno">(2.2.12)<a class="headerlink" href="#equation-eq-fm-cross" title="Permalink to this equation">¶</a></span>\[\sum_{i=1}^{n} \sum_{j=i+1}^{n}\left\langle\mathbf{v}_{i}, \mathbf{v}_{j}\right\rangle x_{i} x_{j} = \frac{1}{2} \sum_{f=1}^{k}\left(\left(\sum_{i=1}^{n} v_{i, f} x_{i}\right)^{2}-\sum_{i=1}^{n} v_{i, f}^{2} x_{i}^{2}\right)\]</div>
<p>This transformation reduces the computational complexity from <span class="math notranslate nohighlight">\(O(kn^2)\)</span> to <span class="math notranslate nohighlight">\(O(kn)\)</span>, enabling FM to handle large-scale sparse data.</p>
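<p>The identity above is easy to verify numerically. The following sketch (with made-up random embeddings, purely for illustration) evaluates the pairwise sum both ways and checks that the two forms agree:</p>

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 6, 4
V = rng.normal(size=(n, k))  # one k-dimensional latent vector per feature
x = rng.normal(size=n)       # feature values

# Naive O(k n^2) form: sum over all pairs i < j of <v_i, v_j> * x_i * x_j
naive = sum(V[i] @ V[j] * x[i] * x[j]
            for i in range(n) for j in range(i + 1, n))

# Reformulated O(k n) form:
# 0.5 * sum_f ((sum_i v_{i,f} x_i)^2 - sum_i v_{i,f}^2 x_i^2)
vx = V * x[:, None]
fast = 0.5 * np.sum(np.sum(vx, axis=0) ** 2 - np.sum(vx ** 2, axis=0))

assert np.isclose(naive, fast)
```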
</section>
<section id="id4">
<h3><span class="section-number">2.2.2.1.2. </span>Decomposing FM into a Two-Tower Structure<a class="headerlink" href="#id4" title="Permalink to this heading">¶</a></h3>
<p>Although this transformation solves FM's computational-complexity problem, the retrieval task poses a further challenge: how do we efficiently select the most relevant candidates for a user from a massive item pool? This is where decomposing FM into a two-tower structure comes in.</p>
<p>In the retrieval setting, all features split naturally into two groups: the user-side feature set <span class="math notranslate nohighlight">\(U\)</span> (e.g., age, gender, historical preferences) and the item-side feature set <span class="math notranslate nohighlight">\(I\)</span> (e.g., category, price, brand).</p>
<p>A key observation: when we score different candidate items for the same user, the user features are fixed. The interaction score within the user features (first- and second-order alike) is therefore identical across all candidates and can be ignored when ranking. We only need to keep:</p>
<ol class="arabic simple">
<li><p>the interaction score within the item features;</p></li>
<li><p>the interaction score between user features and item features.</p></li>
</ol>
<p>Following this idea, we can reorganize FM's second-order interaction term:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-2">
<span class="eqno">(2.2.13)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-2" title="Permalink to this equation">¶</a></span>\[\begin{split}\begin{aligned}
&amp; \frac{1}{2} \sum_{f=1}^{k}\left(\left(\sum_{i=1}^{n} v_{i, f} x_{i}\right)^{2}-\sum_{i=1}^{n} v_{i, f}^{2} x_{i}^{2}\right) \\
=&amp; \frac{1}{2} \sum_{f=1}^{k}\left(\left(\sum_{u \in U} v_{u, f} x_{u} + \sum_{t \in I} v_{t, f} x_{t}\right)^{2}-\sum_{u \in U} v_{u, f}^{2} x_{u}^{2} - \sum_{t\in I} v_{t, f}^{2} x_{t}^{2}\right) \\
=&amp; \frac{1}{2} \sum_{f=1}^{k}\left(\left(\sum_{u \in U} v_{u, f} x_{u}\right)^{2} + \left(\sum_{t \in I} v_{t, f} x_{t}\right)^{2} + 2{\sum_{u \in U} v_{u, f} x_{u}}{\sum_{t \in I} v_{t, f} x_{t}} - \sum_{u \in U} v_{u, f}^{2} x_{u}^{2} - \sum_{t \in I} v_{t, f}^{2} x_{t}^{2}\right)
\end{aligned}\end{split}\]</div>
<p>As the analysis above showed, the user-internal interaction terms are the same for every candidate item and can be dropped at the retrieval stage. FM can then be reorganized to keep only the parts that affect the ranking:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-3">
<span class="eqno">(2.2.14)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-3" title="Permalink to this equation">¶</a></span>\[\text{score}_{FM} = \sum_{t \in I} w_{t} x_{t} + \frac{1}{2} \sum_{f=1}^{k}\left(\left(\sum_{t \in I} v_{t, f} x_{t}\right)^{2}  - \sum_{t \in I} v_{t, f}^{2} x_{t}^{2}\right)  + \sum_{f=1}^{k}\left( {\sum_{u \in U} v_{u, f} x_{u}}{\sum_{t \in I} v_{t, f} x_{t}} \right)\]</div>
<p>This formula exposes an important mathematical structure: the last term <span class="math notranslate nohighlight">\(\sum_{f=1}^{k}\left( {\sum_{u \in U} v_{u, f} x_{u}}{\sum_{t \in I} v_{t, f} x_{t}} \right)\)</span> is precisely the inner product of the two vectors <span class="math notranslate nohighlight">\(\sum_{u \in U} v_{u} x_{u}\)</span> and <span class="math notranslate nohighlight">\(\sum_{t \in I} v_{t} x_{t}\)</span>. This suggests reorganizing the whole matching score into a two-tower form:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-4">
<span class="eqno">(2.2.15)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-4" title="Permalink to this equation">¶</a></span>\[\text{score}_{FM} = V_{item} \cdot V_{user}^T\]</div>
<p>With this reorganization we obtain FM's two-tower representation:</p>
<ul class="simple">
<li><p>User vector: <span class="math notranslate nohighlight">\(V_{user} = [1; \sum_{u \in U} v_{u} x_{u}]\)</span></p></li>
<li><p>Item vector: <span class="math notranslate nohighlight">\(V_{item} = [\sum_{t \in I} w_{t} x_{t} + \frac{1}{2} \sum_{f=1}^{k}((\sum_{t \in I} v_{t, f} x_{t})^{2} - \sum_{t \in I} v_{t, f}^{2} x_{t}^{2}); \sum_{t \in I} v_{t} x_{t}]\)</span></p></li>
</ul>
<p>The design is clever: the user vector consists of a constant 1 followed by the aggregated user features, while the item vector consists of the item's internal interaction score followed by the aggregated item features. Their inner product captures the user-item cross interactions while preserving the item-internal feature relationships.</p>
<p>This decomposition reveals an important principle: even intricate feature-interaction patterns can be realized with suitable vector representations and a simple inner product.</p>
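<p>A quick numerical check confirms that the two-tower inner product reproduces the retained FM score. The sketch below uses toy random embeddings with every feature value set to 1, purely for illustration; the variable names are made up for the example:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
k = 4
Vu = rng.normal(size=(3, k))   # latent vectors of 3 active user-side features
Vi = rng.normal(size=(2, k))   # latent vectors of 2 active item-side features
w_item = rng.normal(size=2)    # first-order weights of the item features

u_sum, i_sum = Vu.sum(axis=0), Vi.sum(axis=0)

# item-internal score: linear term + 0.5 * ((sum v_t)^2 - sum v_t^2), summed over f
item_internal = w_item.sum() + 0.5 * np.sum(i_sum ** 2 - (Vi ** 2).sum(axis=0))

V_user = np.concatenate(([1.0], u_sum))            # [1; sum of user embeddings]
V_item = np.concatenate(([item_internal], i_sum))  # [internal score; sum of item embeddings]
score_two_tower = V_item @ V_user

# reference: retained FM score computed term by term
linear = w_item.sum()
second_item = sum(Vi[i] @ Vi[j] for i in range(2) for j in range(i + 1, 2))
cross = sum(Vu[a] @ Vi[b] for a in range(3) for b in range(2))
score_ref = linear + second_item + cross

assert np.isclose(score_two_tower, score_ref)
```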
<p><strong>Core Code</strong></p>
<p>The key to implementing FM retrieval as two towers is turning the derivation into concrete vector representations. The user tower builds a vector containing the constant term and the aggregated user features:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># User tower: V_user = [1; ∑(v_u * x_u)]</span>
<span class="n">user_concat</span> <span class="o">=</span> <span class="n">Concatenate</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)(</span><span class="n">user_embeddings</span><span class="p">)</span>  <span class="c1"># [batch_size, num_user_features, embedding_dim]</span>
<span class="n">user_embedding_sum</span> <span class="o">=</span> <span class="n">SumPooling</span><span class="p">()(</span><span class="n">user_concat</span><span class="p">)</span>  <span class="c1"># [batch_size, embedding_dim]</span>

<span class="c1"># Build the user vector: [1; ∑(v_u * x_u)]</span>
<span class="n">ones_vector</span> <span class="o">=</span> <span class="n">OnesLayer</span><span class="p">()(</span><span class="n">user_embedding_sum</span><span class="p">)</span>  <span class="c1"># [batch_size, 1]</span>
<span class="n">user_vector</span> <span class="o">=</span> <span class="n">Concatenate</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)([</span><span class="n">ones_vector</span><span class="p">,</span> <span class="n">user_embedding_sum</span><span class="p">])</span>
</pre></div>
</div>
<p>The item tower is more involved, since it must also compute the first-order linear term and the FM interaction term:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Item tower: V_item = [∑w_t*x_t + FM_interaction; ∑(v_t * x_t)]</span>
<span class="n">item_concat</span> <span class="o">=</span> <span class="n">Concatenate</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)(</span><span class="n">item_embeddings</span><span class="p">)</span>  <span class="c1"># [batch_size, num_item_features, embedding_dim]</span>
<span class="n">item_embedding_sum</span> <span class="o">=</span> <span class="n">SumPooling</span><span class="p">()(</span><span class="n">item_concat</span><span class="p">)</span>  <span class="c1"># [batch_size, embedding_dim]</span>

<span class="c1"># First-order linear term: ∑(w_t * x_t)</span>
<span class="n">item_linear_weights</span> <span class="o">=</span> <span class="n">Dense</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">use_bias</span><span class="o">=</span><span class="kc">False</span><span class="p">)(</span><span class="n">item_embedding_sum</span><span class="p">)</span>

<span class="c1"># Second-order FM interaction: 0.5 * ((∑v_t*x_t)² - ∑(v_t²*x_t²))</span>
<span class="n">sum_squared</span> <span class="o">=</span> <span class="n">SquareLayer</span><span class="p">()(</span><span class="n">item_embedding_sum</span><span class="p">)</span>
<span class="n">item_squared</span> <span class="o">=</span> <span class="n">SquareLayer</span><span class="p">()(</span><span class="n">item_concat</span><span class="p">)</span>
<span class="n">squared_sum</span> <span class="o">=</span> <span class="n">SumPooling</span><span class="p">()(</span><span class="n">item_squared</span><span class="p">)</span>
<span class="n">fm_interaction_vector</span> <span class="o">=</span> <span class="n">Subtract</span><span class="p">()([</span><span class="n">sum_squared</span><span class="p">,</span> <span class="n">squared_sum</span><span class="p">])</span>
<span class="n">fm_interaction_scalar</span> <span class="o">=</span> <span class="n">SumScalarLayer</span><span class="p">()(</span><span class="n">ScaleLayer</span><span class="p">(</span><span class="mf">0.5</span><span class="p">)(</span><span class="n">fm_interaction_vector</span><span class="p">))</span>

<span class="c1"># Assemble the item vector</span>
<span class="n">first_term</span> <span class="o">=</span> <span class="n">Add</span><span class="p">()([</span><span class="n">item_linear_weights</span><span class="p">,</span> <span class="n">fm_interaction_scalar</span><span class="p">])</span>
<span class="n">item_vector</span> <span class="o">=</span> <span class="n">Concatenate</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)([</span><span class="n">first_term</span><span class="p">,</span> <span class="n">item_embedding_sum</span><span class="p">])</span>
</pre></div>
</div>
<p>The final matching score is the inner product: <code class="docutils literal notranslate"><span class="pre">fm_score</span> <span class="pre">=</span> <span class="pre">Dot(axes=1)([item_vector,</span> <span class="pre">user_vector])</span></code>. With this design, item vectors can be precomputed offline while the user vector is computed in real time, enabling efficient retrieval.</p>
<p><strong>Training and Evaluation</strong></p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span><span class="w"> </span><span class="nn">funrec</span><span class="w"> </span><span class="kn">import</span> <span class="n">run_experiment</span>

<span class="n">run_experiment</span><span class="p">(</span><span class="s1">&#39;fm_recall&#39;</span><span class="p">)</span>
</pre></div>
</div>
<div class="output highlight-default notranslate"><div class="highlight"><pre><span></span><span class="o">+---------------+--------------+-----------+----------+----------------+---------------+</span>
<span class="o">|</span>   <span class="n">hit_rate</span><span class="o">@</span><span class="mi">10</span> <span class="o">|</span>   <span class="n">hit_rate</span><span class="o">@</span><span class="mi">5</span> <span class="o">|</span>   <span class="n">ndcg</span><span class="o">@</span><span class="mi">10</span> <span class="o">|</span>   <span class="n">ndcg</span><span class="o">@</span><span class="mi">5</span> <span class="o">|</span>   <span class="n">precision</span><span class="o">@</span><span class="mi">10</span> <span class="o">|</span>   <span class="n">precision</span><span class="o">@</span><span class="mi">5</span> <span class="o">|</span>
<span class="o">+===============+==============+===========+==========+================+===============+</span>
<span class="o">|</span>        <span class="mf">0.0123</span> <span class="o">|</span>       <span class="mf">0.0073</span> <span class="o">|</span>    <span class="mf">0.0071</span> <span class="o">|</span>   <span class="mf">0.0055</span> <span class="o">|</span>         <span class="mf">0.0012</span> <span class="o">|</span>        <span class="mf">0.0015</span> <span class="o">|</span>
<span class="o">+---------------+--------------+-----------+----------+----------------+---------------+</span>
</pre></div>
</div>
</section>
</section>
<section id="dssm">
<h2><span class="section-number">2.2.2.2. </span>DSSM: The Deep Structured Semantic Model<a class="headerlink" href="#dssm" title="Permalink to this heading">¶</a></h2>
<p>Although FM elegantly realizes vector factorization, it is still essentially a linear model, with limited capacity to express complex non-linear user-item relationships. The Deep Structured Semantic Model (DSSM) <span id="id5">(<a class="reference internal" href="../../chapter_references/references.html#id20" title="Huang, P.-S., He, X., Gao, J., Deng, L., Acero, A., &amp; Heck, L. (2013). Learning deep structured semantic models for web search using clickthrough data. Proceedings of the 22nd ACM international conference on Information &amp; Knowledge Management (pp. 2333–2338).">Huang <em>et al.</em>, 2013</a>)</span>
pushed the expressive power of two-tower models to a new level by replacing linear transformations with deep neural networks, enabling stronger feature learning and representation. Its core idea is to map users and items into a shared semantic space with deep networks and to measure how well they match by the similarity between their vectors.</p>
<figure class="align-default" id="id12">
<span id="dssm-architecture"></span><a class="reference internal image-reference" href="../../_images/dssm_architecture.svg"><img alt="../../_images/dssm_architecture.svg" src="../../_images/dssm_architecture.svg" width="300px" /></a>
<figcaption>
<p><span class="caption-number">Fig. 2.2.5 </span><span class="caption-text">The DSSM two-tower architecture</span><a class="headerlink" href="#id12" title="Permalink to this image">¶</a></p>
</figcaption>
</figure>
<section id="id6">
<h3><span class="section-number">2.2.2.2.1. </span>The Two-Tower Architecture in Recommendation<a class="headerlink" href="#id6" title="Permalink to this heading">¶</a></h3>
<p>In recommender systems, the DSSM architecture comprises two core components, a user tower and an item tower, each an independent DNN. User features (such as behavior history and demographics) pass through the user tower to produce a user embedding, while item features (such as ID, category, and attributes) pass through the item tower to produce an item embedding. The two embeddings must have the same dimensionality so that similarity can be computed between them.</p>
<p>Compared with FM's linear combination, DSSM's deep structure allows the user-side and item-side features to undergo complex non-linear transformations within their own towers, while interaction between the two towers happens only in the final inner product. This design brings a significant engineering advantage: item vectors can be precomputed offline and user vectors computed in real time, with retrieval completed by efficient ANN search.</p>
</section>
<section id="id7">
<h3><span class="section-number">2.2.2.2.2. </span>The Multi-Class Training Paradigm<a class="headerlink" href="#id7" title="Permalink to this heading">¶</a></h3>
<p>DSSM treats retrieval as an extreme multi-class classification problem, regarding every item in the corpus as a separate class. The model's objective is to maximize the predicted probability of the positive item for each user:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-5">
<span class="eqno">(2.2.16)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-5" title="Permalink to this equation">¶</a></span>\[P(y|x,\theta) = \frac{e^{s(x,y)}}{\sum_{j\in M}e^{s(x,y_j)}}\]</div>
<p>Here <span class="math notranslate nohighlight">\(s(x,y)\)</span> denotes the similarity score between user <span class="math notranslate nohighlight">\(x\)</span> and item <span class="math notranslate nohighlight">\(y\)</span>, <span class="math notranslate nohighlight">\(P(y|x,\theta)\)</span> is the matching probability, and <span class="math notranslate nohighlight">\(M\)</span> is the entire item corpus. Because the corpus is enormous, computing this softmax directly is infeasible, so in practice training uses negative sampling: a number of negatives are sampled for each positive example to approximate the denominator.</p>
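A minimal NumPy sketch of this negative-sampling approximation (the function and array names are hypothetical, not FunRec APIs): the softmax denominator runs over the positive item plus a small set of sampled negatives instead of the whole corpus:

```python
import numpy as np

def sampled_softmax_loss(user_vec, item_table, pos_idx, neg_idx):
    """Cross-entropy over the positive item plus a few sampled negatives."""
    cand_idx = np.concatenate([[pos_idx], neg_idx])
    logits = item_table[cand_idx] @ user_vec      # scores for candidates only
    logits = logits - logits.max()                # subtract max for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[0]                          # the positive sits at position 0

rng = np.random.default_rng(0)
item_table = rng.normal(size=(1000, 16))          # toy corpus: 1000 items, 16-dim embeddings
user_vec = rng.normal(size=16)
neg_idx = rng.choice(1000, size=20, replace=False)
neg_idx = neg_idx[neg_idx != 42]                  # ensure the positive is not re-sampled
loss = sampled_softmax_loss(user_vec, item_table, pos_idx=42, neg_idx=neg_idx)
```

The loss depends on only ~21 item rows per step, which is the whole point when the real corpus holds millions of items.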
</section>
<section id="id8">
<h3><span class="section-number">2.2.2.2.3. </span>Details of the Two-Tower Model<a class="headerlink" href="#id8" title="Permalink to this heading">¶</a></h3>
<p>Beyond its relatively simple model structure, several practical details of the two-tower model deserve careful attention; they often determine the model's final effectiveness, as analyzed by <span id="id9">(<a class="reference internal" href="../../chapter_references/references.html#id24" title="Yi, X., Yang, J., Hong, L., Cheng, D. Z., Heldt, L., Kumthekar, A., … Chi, E. (2019). Sampling-bias-corrected neural modeling for large corpus item recommendations. Proceedings of the 13th ACM conference on recommender systems (pp. 269–277).">Yi <em>et al.</em>, 2019</a>)</span>
among others.</p>
<p><strong>Vector normalization</strong>: apply L2 normalization to the embeddings produced by the user and item towers:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-6">
<span class="eqno">(2.2.17)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-6" title="Permalink to this equation">¶</a></span>\[u \leftarrow \frac{u}{||u||_2}, \quad v \leftarrow \frac{v}{||v||_2}\]</div>
<p>The key role of normalization is to fix the non-metric nature of the raw dot product, which does not satisfy the triangle inequality and can therefore yield inconsistent "distances". For example, for the three points <span class="math notranslate nohighlight">\(A=(10,0)\)</span>, <span class="math notranslate nohighlight">\(B=(0,10)\)</span>, and <span class="math notranslate nohighlight">\(C=(11,0)\)</span>, treating the dot product as a distance gives <span class="math notranslate nohighlight">\(\text{dist}(A,B) &lt; \text{dist}(A,C)\)</span>, which contradicts intuitive geometric distance.</p>
<p>Through normalization, the dot product is converted into a proper distance measure. For normalized vectors <span class="math notranslate nohighlight">\(u\)</span> and <span class="math notranslate nohighlight">\(v\)</span>, their Euclidean distance is:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-7">
<span class="eqno">(2.2.18)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-7" title="Permalink to this equation">¶</a></span>\[||u - v|| = \sqrt{2-2\langle u,v \rangle}\]</div>
<p>The key significance of this transformation is <strong>consistency between training and retrieval</strong>: the similarity used during training (the dot product of normalized vectors) is essentially equivalent to the distance metric used by the online ANN retrieval system (Euclidean distance). This ensures that the vector relationships learned offline carry over correctly to online retrieval, avoiding training-serving skew.</p>
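Both claims can be checked numerically; a small NumPy sketch using the three points from the example above:

```python
import numpy as np

# Raw dot product as a "distance" contradicts geometry (example from the text)
A, B, C = np.array([10.0, 0.0]), np.array([0.0, 10.0]), np.array([11.0, 0.0])
assert A @ B < A @ C                                  # A looks "closer" to B ...
assert np.linalg.norm(A - B) > np.linalg.norm(A - C)  # ... yet is geometrically closer to C

# After L2 normalization, Euclidean distance is a monotone function of the dot product
rng = np.random.default_rng(1)
u = rng.normal(size=8); u /= np.linalg.norm(u)
v = rng.normal(size=8); v /= np.linalg.norm(v)
assert np.isclose(np.linalg.norm(u - v), np.sqrt(2 - 2 * np.dot(u, v)))
```

Because the relation is monotone, ranking by inner product and ranking by Euclidean distance return identical top-k results for normalized vectors, which is exactly the training-retrieval consistency claimed above.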
<p><strong>Temperature scaling</strong>: after computing the inner product of the normalized vectors, divide by a temperature coefficient <span class="math notranslate nohighlight">\(\tau\)</span>:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-8">
<span class="eqno">(2.2.19)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-8" title="Permalink to this equation">¶</a></span>\[s(u,v) = \frac{\langle u,v \rangle}{\tau}\]</div>
<p>The temperature coefficient <span class="math notranslate nohighlight">\(\tau\)</span> looks like a trivial division, but it has a profound effect on training. Mathematically, it rescales the logits and thereby reshapes the softmax output distribution. With <span class="math notranslate nohighlight">\(\tau &lt; 1\)</span>, differences in similarity are amplified: the model assigns higher probability to high-scoring samples and its predictions become more "confident". Conversely, with <span class="math notranslate nohighlight">\(\tau &gt; 1\)</span>, the distribution becomes smoother and the model's predictions more conservative.</p>
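A small NumPy sketch of this effect (the similarity values are made up for illustration):

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()       # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

sims = np.array([0.9, 0.7, 0.3])    # cosine similarities of three candidate items
sharp = softmax(sims / 0.05)        # tau < 1: probability mass concentrates on the top item
smooth = softmax(sims / 5.0)        # tau > 1: the distribution flattens out
```

With a low temperature the top item absorbs nearly all the probability mass, while a high temperature keeps the three candidates close to uniform.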
<p><strong>Core Code</strong></p>
<p>The core of implementing DSSM is building independent user and item towers, each a deep neural network:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Concatenate user-side and item-side features</span>
<span class="n">user_feature</span> <span class="o">=</span> <span class="n">concat_group_embedding</span><span class="p">(</span>
    <span class="n">group_embedding_feature_dict</span><span class="p">,</span> <span class="s2">&quot;user&quot;</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">flatten</span><span class="o">=</span><span class="kc">True</span>
<span class="p">)</span>  <span class="c1"># B x (N*D)</span>
<span class="n">item_feature</span> <span class="o">=</span> <span class="n">concat_group_embedding</span><span class="p">(</span>
    <span class="n">group_embedding_feature_dict</span><span class="p">,</span> <span class="s2">&quot;item&quot;</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">flatten</span><span class="o">=</span><span class="kc">True</span>
<span class="p">)</span>  <span class="c1"># B x (N*D)</span>

<span class="c1"># Build the user tower and item tower (deep neural networks)</span>
<span class="n">user_tower</span> <span class="o">=</span> <span class="n">DNNs</span><span class="p">(</span>
    <span class="n">units</span><span class="o">=</span><span class="n">dnn_units</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s2">&quot;tanh&quot;</span><span class="p">,</span> <span class="n">dropout_rate</span><span class="o">=</span><span class="n">dropout_rate</span><span class="p">,</span> <span class="n">use_bn</span><span class="o">=</span><span class="kc">True</span>
<span class="p">)(</span><span class="n">user_feature</span><span class="p">)</span>
<span class="n">item_tower</span> <span class="o">=</span> <span class="n">DNNs</span><span class="p">(</span>
    <span class="n">units</span><span class="o">=</span><span class="n">dnn_units</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s2">&quot;tanh&quot;</span><span class="p">,</span> <span class="n">dropout_rate</span><span class="o">=</span><span class="n">dropout_rate</span><span class="p">,</span> <span class="n">use_bn</span><span class="o">=</span><span class="kc">True</span>
<span class="p">)(</span><span class="n">item_feature</span><span class="p">)</span>
</pre></div>
</div>
<p>The key vector normalization and similarity computation:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># L2 normalization: ensure consistency between training and retrieval</span>
<span class="n">user_embedding</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Lambda</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">l2_normalize</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">))(</span>
    <span class="n">user_tower</span>
<span class="p">)</span>
<span class="n">item_embedding</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Lambda</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">l2_normalize</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">))(</span>
    <span class="n">item_tower</span>
<span class="p">)</span>

<span class="c1"># Cosine similarity (dot product of normalized vectors)</span>
<span class="n">cosine_similarity</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Dot</span><span class="p">(</span><span class="n">axes</span><span class="o">=</span><span class="mi">1</span><span class="p">)([</span><span class="n">user_embedding</span><span class="p">,</span> <span class="n">item_embedding</span><span class="p">])</span>
</pre></div>
</div>
<p>This design keeps the user and item representations fully independent, so item vectors can be precomputed offline and stored in an ANN index, enabling millisecond-level retrieval.</p>
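A brute-force stand-in for this serving flow (a real deployment would query an ANN library instead; all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
# Offline: precompute and L2-normalize all item embeddings (rows of the "index")
item_index = rng.normal(size=(500, 16))
item_index /= np.linalg.norm(item_index, axis=1, keepdims=True)

# Online: normalize the user embedding and rank the corpus by inner product
user_emb = rng.normal(size=16)
user_emb /= np.linalg.norm(user_emb)
scores = item_index @ user_emb
top10 = np.argsort(-scores)[:10]    # exact top-k; an ANN index returns this approximately
```

The online cost is a single matrix-vector product plus a top-k selection, which is what ANN structures accelerate at corpus scale.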
<p><strong>Training and Evaluation</strong></p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">run_experiment</span><span class="p">(</span><span class="s1">&#39;dssm&#39;</span><span class="p">)</span>
</pre></div>
</div>
<div class="output highlight-default notranslate"><div class="highlight"><pre><span></span><span class="o">+---------------+--------------+-----------+----------+----------------+---------------+</span>
<span class="o">|</span>   <span class="n">hit_rate</span><span class="o">@</span><span class="mi">10</span> <span class="o">|</span>   <span class="n">hit_rate</span><span class="o">@</span><span class="mi">5</span> <span class="o">|</span>   <span class="n">ndcg</span><span class="o">@</span><span class="mi">10</span> <span class="o">|</span>   <span class="n">ndcg</span><span class="o">@</span><span class="mi">5</span> <span class="o">|</span>   <span class="n">precision</span><span class="o">@</span><span class="mi">10</span> <span class="o">|</span>   <span class="n">precision</span><span class="o">@</span><span class="mi">5</span> <span class="o">|</span>
<span class="o">+===============+==============+===========+==========+================+===============+</span>
<span class="o">|</span>        <span class="mf">0.0161</span> <span class="o">|</span>       <span class="mf">0.0131</span> <span class="o">|</span>    <span class="mf">0.0083</span> <span class="o">|</span>   <span class="mf">0.0074</span> <span class="o">|</span>         <span class="mf">0.0016</span> <span class="o">|</span>        <span class="mf">0.0026</span> <span class="o">|</span>
<span class="o">+---------------+--------------+-----------+----------+----------------+---------------+</span>
</pre></div>
</div>
</section>
</section>
<section id="youtubednn">
<h2><span class="section-number">2.2.2.3. </span>YouTubeDNN: From Matching to Predicting the User's Next Action<a class="headerlink" href="#youtubednn" title="Permalink to this heading">¶</a></h2>
<p>The YouTube deep neural network recommender <span id="id10">(<a class="reference internal" href="../../chapter_references/references.html#id28" title="Covington, P., Adams, J., &amp; Sargin, E. (2016). Deep neural networks for youtube recommendations. Proceedings of the 10th ACM conference on recommender systems (pp. 191–198).">Covington <em>et al.</em>, 2016</a>)</span>
represents an important milestone in the evolution of two-tower models. YouTubeDNN keeps the two-tower architecture but introduces a key conceptual shift: it reframes retrieval as "predicting the next video the user will watch".</p>
<figure class="align-default" id="id13">
<span id="youtubednn-candidate"></span><a class="reference internal image-reference" href="../../_images/youtubednn_candidate.png"><img alt="../../_images/youtubednn_candidate.png" src="../../_images/youtubednn_candidate.png" style="width: 500px;" /></a>
<figcaption>
<p><span class="caption-number">Fig. 2.2.6 </span><span class="caption-text">The YouTubeDNN candidate generation architecture</span><a class="headerlink" href="#id13" title="Permalink to this image">¶</a></p>
</figcaption>
</figure>
<p>YouTubeDNN adopts an "asymmetric" two-tower architecture. The user tower integrates multi-modal information such as watch history, search history, and demographic features; the IDs of watched videos are mapped through an embedding layer and aggregated by average pooling, and the model further introduces an "Example Age" feature to capture the effect of content freshness. The item tower is comparatively simple: it is essentially a huge embedding matrix with one learnable vector per video, avoiding complex item-side feature engineering.</p>
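The user-tower input described above can be sketched in NumPy with a toy embedding table and a hypothetical scalar "Example Age" feature (all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
video_emb = rng.normal(size=(100, 8))              # toy embedding table for 100 videos
watch_history = [3, 17, 42, 42, 7]                 # recently watched video IDs
watch_vec = video_emb[watch_history].mean(axis=0)  # average pooling over the watch history

example_age = np.array([0.5])                      # "Example Age" appended as a scalar
user_tower_input = np.concatenate([watch_vec, example_age])
```

Average pooling keeps the input dimensionality fixed regardless of how many videos the user has watched, which is what lets a single DNN serve all users.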
<p>This "predict the next watched video" formulation is essentially analogous to next-token prediction in NLP and can be naturally modeled as an extreme multi-class classification problem:</p>
<div class="math notranslate nohighlight" id="equation-chapter-1-retrieval-2-embedding-2-u2i-9">
<span class="eqno">(2.2.20)<a class="headerlink" href="#equation-chapter-1-retrieval-2-embedding-2-u2i-9" title="Permalink to this equation">¶</a></span>\[P(w_t=i|U,C) = \frac{e^{v_i \cdot u}}{\sum_{j \in V} e^{v_j \cdot u}}\]</div>
<p>Here <span class="math notranslate nohighlight">\(w_t\)</span> is the video the user watches at time <span class="math notranslate nohighlight">\(t\)</span>, <span class="math notranslate nohighlight">\(U\)</span> denotes the user features, <span class="math notranslate nohighlight">\(C\)</span> the context, and <span class="math notranslate nohighlight">\(V\)</span> the entire video corpus. Because the corpus is huge, computing the full softmax is infeasible, so training uses sampled softmax for efficiency.</p>
<section id="id11">
<h3><span class="section-number">2.2.2.3.1. </span>Key Engineering Techniques<a class="headerlink" href="#id11" title="Permalink to this heading">¶</a></h3>
<p>YouTubeDNN's success comes not only from its model design but also from a series of carefully crafted engineering techniques:</p>
<p><strong>Asymmetric temporal splitting</strong>: traditional collaborative filtering typically holds out random items for validation, but this leaks future information. Video consumption exhibits clearly asymmetric patterns: episodes are usually watched in order, and users tend to start from popular content before gradually moving into niche areas. YouTubeDNN therefore adopts a temporal split: for a watch event chosen as the prediction target, only the user's behavior before that target is used as input features. This "rollback" mechanism better matches the real recommendation setting.</p>
<figure class="align-default" id="id14">
<span id="youtubednn-temporal-split"></span><a class="reference internal image-reference" href="../../_images/youtubednn_temporal_split.png"><img alt="../../_images/youtubednn_temporal_split.png" src="../../_images/youtubednn_temporal_split.png" style="width: 600px;" /></a>
<figcaption>
<p><span class="caption-number">Fig. 2.2.7 </span><span class="caption-text">Asymmetric co-watch patterns</span><a class="headerlink" href="#id14" title="Permalink to this image">¶</a></p>
</figcaption>
</figure>
<p><strong>Negative sampling strategy</strong>: to handle a softmax over millions of classes efficiently, the model uses importance sampling, computing scores for only a few thousand negatives at a time, which speeds up training by more than 100x.</p>
<p><strong>Per-user sample balancing</strong>: generate a fixed number of training examples for each user so that highly active users do not dominate what the model learns. This seemingly simple trick is crucial for improving recommendation quality for long-tail users.</p>
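A simple sketch of such per-user capping (the helper is illustrative, not from the paper or FunRec):

```python
from collections import defaultdict

def cap_per_user(samples, max_per_user):
    """Keep at most max_per_user (user, item) training examples per user."""
    kept, counts = [], defaultdict(int)
    for user_id, item_id in samples:
        if counts[user_id] < max_per_user:
            kept.append((user_id, item_id))
            counts[user_id] += 1
    return kept

# A heavy user with 100 events no longer drowns out a light user with 1
samples = [("heavy", i) for i in range(100)] + [("light", 0)]
balanced = cap_per_user(samples, max_per_user=10)
```

After capping, the heavy user contributes 10 examples and the light user still contributes 1, so gradient updates are no longer dominated by a few highly active accounts.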
<p>YouTubeDNN's success lies in establishing a scalable, production-ready recommendation paradigm: train with a complex multi-class objective and rich user features, then serve by precomputing item vectors, computing user vectors in real time, and completing retrieval with efficient ANN search. This design strikes an effective balance between training complexity and serving efficiency, and it is still widely emulated today.</p>
<p><strong>Core Code</strong></p>
<p>YouTubeDNN's user tower embodies the "asymmetric" idea, integrating multiple user features with the historical behavior sequence:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Integrate user features and the historical behavior sequence</span>
<span class="n">user_feature_embedding</span> <span class="o">=</span> <span class="n">concat_group_embedding</span><span class="p">(</span>
    <span class="n">group_embedding_feature_dict</span><span class="p">,</span> <span class="s2">&quot;user_dnn&quot;</span>
<span class="p">)</span>  <span class="c1"># B x (D * N)</span>

<span class="k">if</span> <span class="s2">&quot;raw_hist_seq&quot;</span> <span class="ow">in</span> <span class="n">group_embedding_feature_dict</span><span class="p">:</span>
    <span class="n">hist_seq_embedding</span> <span class="o">=</span> <span class="n">concat_group_embedding</span><span class="p">(</span>
        <span class="n">group_embedding_feature_dict</span><span class="p">,</span> <span class="s2">&quot;raw_hist_seq&quot;</span>
    <span class="p">)</span>  <span class="c1"># B x D</span>
    <span class="n">user_dnn_inputs</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">concat</span><span class="p">(</span>
        <span class="p">[</span><span class="n">user_feature_embedding</span><span class="p">,</span> <span class="n">hist_seq_embedding</span><span class="p">],</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span>
    <span class="p">)</span>  <span class="c1"># B x (D * N + D)</span>
<span class="k">else</span><span class="p">:</span>
    <span class="n">user_dnn_inputs</span> <span class="o">=</span> <span class="n">user_feature_embedding</span>

<span class="c1"># Build the user tower: output a normalized user vector</span>
<span class="n">user_dnn_output</span> <span class="o">=</span> <span class="n">DNNs</span><span class="p">(</span>
    <span class="n">units</span><span class="o">=</span><span class="n">dnn_units</span> <span class="o">+</span> <span class="p">[</span><span class="n">emb_dim</span><span class="p">],</span> <span class="n">activation</span><span class="o">=</span><span class="s2">&quot;relu&quot;</span><span class="p">,</span> <span class="n">use_bn</span><span class="o">=</span><span class="kc">False</span>
<span class="p">)(</span><span class="n">user_dnn_inputs</span><span class="p">)</span>
<span class="n">user_dnn_output</span> <span class="o">=</span> <span class="n">L2NormalizeLayer</span><span class="p">(</span><span class="n">axis</span><span class="o">=-</span><span class="mi">1</span><span class="p">)(</span><span class="n">user_dnn_output</span><span class="p">)</span>
</pre></div>
</div>
<p>The item tower uses a simplified design, reading directly from the item embedding table:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Item embedding table (retrieved from the feature-column config)</span>
<span class="n">item_embedding_table</span> <span class="o">=</span> <span class="n">embedding_table_dict</span><span class="p">[</span><span class="n">label_name</span><span class="p">]</span>

<span class="c1"># Build the item model for evaluation</span>
<span class="n">output_item_embedding</span> <span class="o">=</span> <span class="n">SqueezeLayer</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)(</span>
    <span class="n">item_embedding_table</span><span class="p">(</span><span class="n">input_layer_dict</span><span class="p">[</span><span class="n">label_name</span><span class="p">])</span>
<span class="p">)</span>
<span class="n">output_item_embedding</span> <span class="o">=</span> <span class="n">L2NormalizeLayer</span><span class="p">(</span><span class="n">axis</span><span class="o">=-</span><span class="mi">1</span><span class="p">)(</span><span class="n">output_item_embedding</span><span class="p">)</span>
</pre></div>
</div>
<p>Training uses sampled softmax, turning the million-way multi-class problem into efficient sampled learning:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Build the sampled-softmax layer</span>
<span class="n">sampled_softmax_layer</span> <span class="o">=</span> <span class="n">SampledSoftmaxLayer</span><span class="p">(</span><span class="n">item_vocab_size</span><span class="p">,</span> <span class="n">neg_sample</span><span class="p">,</span> <span class="n">emb_dim</span><span class="p">)</span>
<span class="n">output</span> <span class="o">=</span> <span class="n">sampled_softmax_layer</span><span class="p">([</span>
    <span class="n">item_embedding_table</span><span class="o">.</span><span class="n">embeddings</span><span class="p">,</span>
    <span class="n">user_dnn_output</span><span class="p">,</span>
    <span class="n">input_layer_dict</span><span class="p">[</span><span class="n">label_name</span><span class="p">]</span>
<span class="p">])</span>
</pre></div>
</div>
<p>The core advantage of this design is that the user tower can flexibly grow in features and model complexity as the business requires, while the item tower stays simple and efficient, easy to precompute offline and to query in real time.</p>
<p><strong>Training and Evaluation</strong></p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">run_experiment</span><span class="p">(</span><span class="s1">&#39;youtubednn&#39;</span><span class="p">)</span>
</pre></div>
</div>
<div class="output highlight-default notranslate"><div class="highlight"><pre><span></span><span class="o">+---------------+--------------+-----------+----------+----------------+---------------+</span>
<span class="o">|</span>   <span class="n">hit_rate</span><span class="o">@</span><span class="mi">10</span> <span class="o">|</span>   <span class="n">hit_rate</span><span class="o">@</span><span class="mi">5</span> <span class="o">|</span>   <span class="n">ndcg</span><span class="o">@</span><span class="mi">10</span> <span class="o">|</span>   <span class="n">ndcg</span><span class="o">@</span><span class="mi">5</span> <span class="o">|</span>   <span class="n">precision</span><span class="o">@</span><span class="mi">10</span> <span class="o">|</span>   <span class="n">precision</span><span class="o">@</span><span class="mi">5</span> <span class="o">|</span>
<span class="o">+===============+==============+===========+==========+================+===============+</span>
<span class="o">|</span>        <span class="mf">0.0303</span> <span class="o">|</span>       <span class="mf">0.0219</span> <span class="o">|</span>    <span class="mf">0.0169</span> <span class="o">|</span>   <span class="mf">0.0142</span> <span class="o">|</span>          <span class="mf">0.003</span> <span class="o">|</span>        <span class="mf">0.0044</span> <span class="o">|</span>
<span class="o">+---------------+--------------+-----------+----------+----------------+---------------+</span>
</pre></div>
</div>
</section>
</section>
</section>


        </div>
        <div class="side-doc-outline">
            <div class="side-doc-outline--content"> 
<div class="localtoc">
    <p class="caption">
      <span class="caption-text">Table Of Contents</span>
    </p>
    <ul>
<li><a class="reference internal" href="#">2.2.2. U2I Retrieval</a><ul>
<li><a class="reference internal" href="#fm">2.2.2.1. FM (Factorization Machines): The Prototype of the Two-Tower Model</a><ul>
<li><a class="reference internal" href="#id3">2.2.2.1.1. From Interaction Matrix to Vector Inner Product</a></li>
<li><a class="reference internal" href="#id4">2.2.2.1.2. Decomposing into a Two-Tower Structure</a></li>
</ul>
</li>
<li><a class="reference internal" href="#dssm">2.2.2.2. DSSM: The Deep Structured Semantic Model</a><ul>
<li><a class="reference internal" href="#id6">2.2.2.2.1. The Two-Tower Architecture in Recommendation</a></li>
<li><a class="reference internal" href="#id7">2.2.2.2.2. The Multi-Class Training Paradigm</a></li>
<li><a class="reference internal" href="#id8">2.2.2.2.3. Details of the Two-Tower Model</a></li>
</ul>
</li>
<li><a class="reference internal" href="#youtubednn">2.2.2.3. YouTubeDNN: From Matching to Predicting the User's Next Action</a><ul>
<li><a class="reference internal" href="#id11">2.2.2.3.1. Key Engineering Techniques</a></li>
</ul>
</li>
</ul>
</li>
</ul>

</div>
            </div>
        </div>

      <div class="clearer"></div>
    </div><div class="pagenation">
<a id="button-prev" href="1.i2i.html" class="mdl-button mdl-js-button mdl-js-ripple-effect mdl-button--colored" role="button" accesskey="P">
         <i class="pagenation-arrow-L fas fa-arrow-left fa-lg"></i>
         <div class="pagenation-text">
            <span class="pagenation-direction">Previous</span>
<div>2.2.1. I2I Retrieval</div>
         </div>
     </a>
<a id="button-next" href="3.summary.html" class="mdl-button mdl-js-button mdl-js-ripple-effect mdl-button--colored" role="button" accesskey="N">
         <i class="pagenation-arrow-R fas fa-arrow-right fa-lg"></i>
        <div class="pagenation-text">
            <span class="pagenation-direction">Next</span>
<div>2.2.3. Summary</div>
        </div>
     </a>
  </div>
        
        </main>
    </div>
  </body>
</html>